Upsortable: Programming TopK Queries Over Data Streams
Authors
Abstract
Top-k queries over data streams are a well-studied problem. Numerous systems exist for processing continuous queries over sliding windows. In contrast, non-append-only streams call for ad hoc solutions, e.g., tailor-made solutions implemented in a mainstream programming language. Meanwhile, Java 8 added the Stream API and lambda expressions, giving the language powerful operations for data stream processing. However, the Java Collections Framework does not provide data structures that safely and conveniently support sorted collections of evolving data. In this paper, we demonstrate Upsortable, an annotation-based approach that allows existing sorted collections from the standard Java API to be used for dynamic data management. Our approach relies on a combination of pre-compilation abstract syntax tree modifications and runtime analysis of bytecode. Upsortable offers the developer a safe and time-efficient solution for developing top-k queries on data streams while remaining fully compatible with standard Java.
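The gap the abstract describes in the Java Collections Framework can be illustrated with a minimal sketch. It is not Upsortable's implementation or API: the `Player` class, the `TopKDemo` class, and every method name below are hypothetical, and only standard `java.util` calls are used. The sketch shows that changing a sort key in place leaves a `TreeSet` in a stale order, and that the usual manual workaround is to remove the element, update it, and re-insert it.

```java
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

class TopKDemo {
    // Hypothetical stream item; the score is the evolving sort key.
    static final class Player {
        final String name;
        int score;
        Player(String name, int score) { this.name = name; this.score = score; }
    }

    // Highest score first; the name breaks ties so distinct players never compare equal.
    static final Comparator<Player> BY_SCORE_DESC =
        Comparator.comparingInt((Player p) -> p.score).reversed()
                  .thenComparing(p -> p.name);

    // Top-k via the Stream API: the first k elements in the set's iteration order.
    static List<String> topK(TreeSet<Player> set, int k) {
        return set.stream().limit(k).map(p -> p.name).collect(Collectors.toList());
    }

    // Manual workaround: remove under the old key, mutate, re-insert under the new key.
    static void safeUpdate(TreeSet<Player> set, Player p, int newScore) {
        set.remove(p);
        p.score = newScore;
        set.add(p);
    }

    static TreeSet<Player> sample(Player... players) {
        TreeSet<Player> set = new TreeSet<>(BY_SCORE_DESC);
        for (Player p : players) set.add(p);
        return set;
    }

    // Unsafe: the field changes but the tree is not reordered, so the order is stale.
    static List<String> topAfterInPlaceMutation() {
        Player c = new Player("c", 10);
        TreeSet<Player> set = sample(new Player("a", 30), new Player("b", 20), c);
        c.score = 40; // c should now rank first, but it stays in last position
        return topK(set, 2);
    }

    // Safe: the same update routed through remove / mutate / add.
    static List<String> topAfterSafeUpdate() {
        Player c = new Player("c", 10);
        TreeSet<Player> set = sample(new Player("a", 30), new Player("b", 20), c);
        safeUpdate(set, c, 40);
        return topK(set, 2);
    }

    public static void main(String[] args) {
        System.out.println(topAfterInPlaceMutation()); // prints [a, b] (stale)
        System.out.println(topAfterSafeUpdate());      // prints [c, a]
    }
}
```

The remove-and-reinsert pattern works only if every write to the sort key goes through `safeUpdate`, and a single direct field assignment silently breaks the order. That burden on the developer is the kind of problem the abstract says an annotation-based approach is meant to remove.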
Similar resources
A dynamic method for answering ad hoc continuous aggregate queries
Data streams are unbounded, fast sequences of time-stamped data elements that arrive in bursts. Generally, these elements need to be processed online and in real time, so algorithms that process data streams and answer queries on them are mostly one-pass. Executing such algorithms raises challenges such as memory limitations, scheduling, and the accuracy of answers. They will be more ...
Reporting l most influential objects in uncertain databases based on probabilistic reverse top-k queries
Reverse top-k queries are proposed from the perspective of a product manufacturer and are essential for manufacturers to assess the potential market. However, the existing approaches for reverse top-k queries are all based on the assumption that the underlying data are exact (or certain). Due to the intrinsic differences between uncertain and certain data, these methods cannot be applied to pr...
Top-k Dominating Queries: a Survey
Top-k dominating queries combine the advantages of top-k queries and skyline queries, and eliminate their disadvantages. They return the k objects with the highest domination score, which is defined as the number of dominated objects. As with a top-k query, the user can bound the number of returned results through the parameter k, and, as with a skyline query, a user-selected scoring function is not required...
CrowdK: Answering top-k queries with crowdsourcing
In recent years, crowdsourcing has emerged as a new computing paradigm for bridging the gap between human- and machine-based computation. As one of the core operations in data retrieval, we study top-k queries with crowdsourcing, namely crowd-enabled top-k queries. This problem is formulated with three key factors: latency, monetary cost, and quality of answers. We first aim to design a novel fr...
Handling ER-topk Query on Uncertain Streams
Data uncertainty widely exists in many applications. In this paper, we aim at handling top-k queries on uncertain data streams. Since the volume of a data stream is unbounded whereas the memory resource is limited, it is critical to devise one-pass solutions that are both time- and space-efficient. In this paper, we use two structures to handle this issue. The DomGraph stores all tuples that are p...
Journal title:
- PVLDB
Volume 10, Issue
Pages -
Publication date: 2017